Improved likelihood ratio test based voice activity detector applied to speech recognition
Authors
Abstract
The accuracy of speech processing systems is strongly affected by acoustic noise, which is a serious obstacle to meeting the demands of modern applications. These systems therefore often need a noise reduction algorithm working in combination with a precise voice activity detector (VAD). The computation needed for denoising and speech detection must not exceed the limits imposed by real-time speech processing systems. This paper presents a novel VAD for improving speech detection robustness in noisy environments and the performance of speech recognition systems in real-time applications. The algorithm is based on a Multivariate Complex Gaussian (MCG) observation model and defines an optimal likelihood ratio test (LRT) involving multiple and correlated observations (MCO), based on a jointly Gaussian probability distribution function (jGpdf) and a symmetric covariance matrix. The complete derivation of the jGpdf-LRT for the general case of a symmetric covariance matrix is given in terms of the Cholesky decomposition, which allows the VAD decision rule to be computed efficiently. An extensive analysis of the proposed methodology for a low-dimensional observation model demonstrates: (i) the improved robustness of the proposed approach, shown by a clear reduction of the classification error as the number of observations is increased, and (ii) the trade-off between the number of observations and the detection performance. The proposed strategy is also compared with different VAD methods, including the G.729, AMR and AFE standards, as well as other recently reported algorithms, showing a sustained advantage in speech/non-speech detection accuracy and speech recognition performance on the AURORA databases. © 2010 Elsevier B.V. All rights reserved.
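As a rough illustration of why a Cholesky factorization makes a Gaussian likelihood ratio test cheap to evaluate, the following minimal sketch computes a log-likelihood ratio for one observation vector. It is not the paper's derivation: it assumes real-valued, zero-mean observations (the paper uses a complex Gaussian model) and covariance matrices that are already known, and all function names (`cholesky`, `gaussian_log_density`, `log_likelihood_ratio`, `is_speech`) and the `threshold` parameter are hypothetical.

```python
import math


def cholesky(c):
    """Return lower-triangular L with L L^T = c (c symmetric positive definite)."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def gaussian_log_density(x, c):
    """Log-density of a zero-mean real Gaussian vector x with covariance c.

    Assumption: real-valued model, used here only for illustration.
    """
    low = cholesky(c)
    # Forward substitution solves L z = x, so that x^T c^{-1} x = ||z||^2.
    z = []
    for i in range(len(x)):
        z.append((x[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    # log det(c) = 2 * sum(log L_ii), read directly off the Cholesky factor.
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(len(x)))
    return -0.5 * (quad + log_det + len(x) * math.log(2.0 * math.pi))


def log_likelihood_ratio(x, c_noise, c_speech):
    """log p(x | speech present) - log p(x | noise only)."""
    return gaussian_log_density(x, c_speech) - gaussian_log_density(x, c_noise)


def is_speech(x, c_noise, c_speech, threshold=0.0):
    """Decide 'speech' when the log-likelihood ratio exceeds the threshold."""
    return log_likelihood_ratio(x, c_noise, c_speech) > threshold
```

Each covariance matrix is factorized once per evaluation; the quadratic form and the log-determinant both follow from the triangular factor, so no explicit matrix inverse is needed.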
Similar resources
Improved Likelihood Ratio Test Detector Using a Jointly Gaussian Probability Distribution Function
Currently, the accuracy of speech processing systems is strongly affected by acoustic noise. This is a serious obstacle to meeting the demands of modern applications, and therefore these systems often need a noise reduction algorithm working in combination with a precise voice activity detector (VAD). This paper presents a new voice activity detector (VAD) for improving speech detection robust...
An Efficient VAD Based on a Hang-Over Scheme and a Likelihood Ratio Test
The emerging applications of wireless speech communication are demanding increasing levels of performance in noise-adverse environments together with the design of high response rate speech processing systems. This is a serious obstacle to meeting the demands of modern applications, and therefore these systems often need a noise reduction algorithm working in combination with a precise voice activ...
Improved voice activity detection based on a smoothed statistical likelihood ratio
This paper presents the behavioural mechanism of a statistical model-based voice activity detector (VAD), featuring a likelihood ratio test for the activity decision. Investigation of the VAD shows that detection errors can occur frequently at speech offset regions because of the delay term in the decision-directed parameter estimator, employed for the estimation of an unknown para...
Likelihood ratio test with complex Laplacian model for voice activity detection
This paper proposes a voice activity detector (VAD) based on the complex Laplacian model. With the use of a goodness-of-fit (GOF) test, it is discovered that the Laplacian model is more suitable to describe noisy speech distribution than the conventional Gaussian model. The likelihood ratio (LR) based on the Laplacian model is computed and then applied to the VAD operation. According to the exp...
An Efficient VAD Based on a Generalized Gaussian PDF
The emerging applications of wireless speech communication are demanding increasing levels of performance in noise-adverse environments together with the design of high response rate speech processing systems. This is a serious obstacle to meeting the demands of modern applications, and therefore these systems often need a noise reduction algorithm working in combination with a precise voice activ...
Journal:
- Speech Communication
Volume 52, Issue
Pages -
Publication date: 2010